HMM state clustering across allophone class boundaries
نویسندگان
چکیده
We present a novel approach to hidden Markov model (HMM) state clustering based on the use of broad phone classes and an allophone class entropy measure. Most state-of-the-art largevocabulary speech recognizers are based on context-dependent (CD) phone HMMs that use Gaussian mixture models for the state-conditioned observation densities. A common approach for robust HMM parameter estimation is to cluster HMM states where each state cluster shares a set of parameters such as the components of a Gaussian mixture model. In all the current state clustering algorithms, the HMM states are clustered only within their respective allophone classes. While this makes some intuitive sense, it prevents the clustering of states across allophone class boundaries, even when the states are acoustically similar. Our algorithm allows clustering across allophone class boundaries by defining broad phone groups within which two states from different allophone classes can be clustered together. An allophone class entropy measure is used to control the clustering of states belonging to different allophone classes. Experimental results on three test sets are presented.
منابع مشابه
Feature-dependent allophone clustering
We propose a novel method for clustering allophones called Feature-Dependent Allophone Clustering (FD-AC) that determines feature-dependent HMM topology automatically. Existing methods for allophone clustering are based on parameter sharing between the allophone models that resemble each other in behaviors of feature vector sequences. However, all the features of the vector sequences may not ne...
متن کاملExtensions to phone-state decision-tree clustering: single tree and tagged clustering
The following article describes two extensions to the \traditional" decision tree methods for clustering allophone HMM states in LVCSR systems. The rst, single tree clustering, combines all allophone states of all phones into a single tree. This can be used to improve performance for very small systems. The single tree clustering structure can also be exploited for speaker and channel adaptatio...
متن کاملTaking Advantage of Spanish Speech Resources to Improve Catalan Acoustic HMMs
At TALP, we are working on speech recognition of official languages in Catalonia, i.e. Spanish and Catalan. These two languages share approximately 80 % of their allophones. The speech databases that we have available to train HMMs in Catalan have a smaller size than the Spanish databases. This difference of size of training databases results in poorer phonetic unit models for Catalan than for ...
متن کاملAllophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
متن کاملNovel entropy based moving average refiners for HMM landmarks
The training of precise speech recognition models depends on accurate segmentation of the phonemes in a training corpus. Segmentation is typically performed using HMMs, but recent speech recognition work suggests that the transient acoustic features characteristic of manner-class phoneme boundaries (landmarks) may be more precisely localized using acoustic classifiers specifically designed for ...
متن کامل